Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add KIP1 patching support (with 2 FS patches included) #51

Merged
merged 1 commit into from
Jul 29, 2018

Conversation

rajkosto
Copy link
Contributor

Currently supported patches are nogc (Disable GC comms) and nosigchk (Allow unsigned NCA loading).
They are supplied via launch configuration's kip1patch variable, of which there can be multiple, or they can be comma separated
The README and sample INI have also been updated to reflect this.

Currently the CTCaer logo must be disabled, otherwise the payload exceeds maximum size. Switching to -Os decreases size significantly so the logo can be enabled, but this slows down FS KIP1 decompression from 3.5s to 10.1s !! So some compromise must be made in between (maybe removing some really rarely used tools/dumping options?)

@CTCaer
Copy link
Owner

CTCaer commented Jul 23, 2018

Thanks I'll take a look on this when possible. Currently on vacations.

About the size:
I have 3 things in mind

  • either remove some patching mechanisms and let the various cfw's payloads do the job
  • or make some functions or all patches modular
  • or compress some big size functions

We'll see.. For now I want to continue having only one binary.

BTW, what's the payload size with this PR and the menu logo?

@rajkosto
Copy link
Contributor Author

rajkosto commented Jul 23, 2018

these are all with -O2

119285/126296 without ctcaer logo (which is how im building it atm)
128197/126296 with ctcaer logo (oof)

simply stubbing print_mmc_info() gets it to
126145/126296 bytes which is barely enough to pass.

compiling most functions with -Os except some time critical ones might be the way to go, if it doesnt become an organizational nightmare
for shaving off the required 2KB maybe just restructuring how the SDRAM parameters are compressed might do the trick ?

@CTCaer
Copy link
Owner

CTCaer commented Jul 23, 2018

Almost 2 kilos, hmm...

Thanks.

Edit:
Sdram functions are the biggest, yeah.
Another thing I was thinking was to compress the whole binary. But this needs a block compression algorithm which apparently adds too much fat and has a lower compression ratio.

@rajkosto
Copy link
Contributor Author

rajkosto commented Jul 23, 2018

you know where hekate switches the stack to SDRAM ? you can use that opportunity to make minimal code that runs at boot, and the rest can just be a binary embedded into rodata that you lz4 decompress into ram and jump to at that point (but this requires splitting everything into stage1 and stage2, maybe not worth doing if you can just shave off a KB by combining all the sdram parameters into one lz4 block you decompress and then choose)

@Valeri0p
Copy link

Offtopic: thanks guys, I think hekate will be superior that account-pwner ripoff program.
The switch Needs hackers like you...sorry for the intrusion 😅

@martinnnnnnnnnnnnnnnnnnn

I believe modularity is the best solution here. No features need to be removed.

@rajkosto
Copy link
Contributor Author

rajkosto commented Jul 23, 2018

tried with all compression done like so
lz4 -9 -B4 --no-frame-crc --content-size ctcaer_menu_logo.bin ctcaer_menu_logo.lz4

Doing this with the MENU_LOGO enabled reduces the size from 128197 to 127437 bytes, still over 126296, but a cost saving of 760 bytes total (which means this pull request with MENU_LOGO still enabled is only 1141 bytes over)

@CTCaer
Copy link
Owner

CTCaer commented Jul 23, 2018

There are several reasons I keep this logo in the menu.
One of it is to give me size challenges to optimize and better the code.

Also, I did the tests before with lz4.
Size changes are negligible versus lz77, but speed is another story.
LZ4 was around 4.85 times slower than LZ77, a 77' algo..
So this can't save it. lp0 code as a LZ77 blob will have a big impact, though (no need for LZ4 here also).

Well don't worry about that. It's out of the scope of this PR.
So it's better to reverse head to 2b21075 and I'll find a better way (except if you beat me to it).
I like to do tests (even failed ones) anyway.

@rajkosto
Copy link
Contributor Author

rajkosto commented Jul 23, 2018

I don't find this lz4 noticably slower for what it's used for (the logos and sdram config) ? If you really want slower/more memory hungry, i have LZMA... which cant even be used for sdram because of the memory requirements, but it would make all your logos like 2x smaller.

EDIT: reverted, lz4 is in its own branch if you want it

@rajkosto
Copy link
Contributor Author

This new BLZ decompression code is 2.5x faster (1472ms vs previous 3458ms on O2 and 8114ms vs previous 13458ms on Os) and also consumes 100 bytes less code space !

@rajkosto
Copy link
Contributor Author

rajkosto commented Jul 24, 2018

By compressing the boot/menu logo with BLZ, this is finally below the payload size threshold !
(LZ77 is still in there and used for the memory parameters, since it's way more efficient for that)

Payload size is 126089
Max size is 126296 Bytes.

To compress i used
blzcomp.exe ctcaer_menu_logo.bin ctcaer_menu_logo.blz
which you can find here: https://github.com/rajkosto/memloader/blob/master/tools/blzcomp.cpp

@CTCaer
Copy link
Owner

CTCaer commented Jul 25, 2018

I'm really eager to test there, but it's difficult for me to follow and test these with just a phone, so here are 2 questions:

  • Is it OK that the kips are not compressed again? (pk2.1 has a limit in size.)
  • How's the speed and size, lz77 vs blz, with a specific set of data? (I normally test these with my 4mb BOOT0 part.)

Edit:
Max pkg2 size is almost 8MB.

@rajkosto
Copy link
Contributor Author

rajkosto commented Jul 25, 2018

It's ok for the loader, don't know about the size, FS is 1.37MB compressed but 4.2MB uncompressed for the largest one, however recompression would take infeasibly long on the BPMP its not even an option (a better way would be to only uncompress .text section of FS since that's where the patches are applied to, ill take note of that)

the size performance of each compressor is very dependent on the data you feed it, BLZ fared very badly on the sdram parameters compared to even LZ77, but it beat out LZ4 (which beats out LZ77) for your boot images, so use the one that compresses your data best, silly to test them out on BOOT0 which is a bunch of incompressible data like signatures and encrypted pkg1, with long strings of zeroes of padding

@rajkosto
Copy link
Contributor Author

Decompressing only .text of 5.1.0 exfat kip1 makes the size grow from 1.370 to 2.103 MB only, much better than decompressing everything to 4.2 MB, so i am definitely going to separate out the patches by section and only decompress ones that are absolutely necessary

@CTCaer
Copy link
Owner

CTCaer commented Jul 25, 2018

I like the direction you are going. Normally not needed but for projects like this it's a great optimization!

@rajkosto
Copy link
Contributor Author

rajkosto commented Jul 25, 2018

Payload size is 126269
Max size is 126296 Bytes.
https://images.sshnuke.net/sweating%20miller.jpg

anyway, FS kip1 decompression time is now just 654ms and only an additional 800KB is used compared to compressed, so its about as good as its gonna get.

Copy link
Owner

@CTCaer CTCaer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not hang inside secondary functions (like pkg2 ones).

Also when hanging or returning from main functions, we must always umount the sd card. Even if we only read from it. This helps with avoiding corruption and voltage spikes on forced power off or removal of card.

Also please squash your commits if possible.

ipl/pkg2.c Outdated
@@ -292,6 +482,255 @@ void pkg2_merge_kip(link_t *info, pkg2_kip1_t *kip1)
pkg2_add_kip(info, kip1);
}

void pkg2_decompress_kip(pkg2_kip1_info_t* ki, u32 sectsToDecomp)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function should return an int

ipl/pkg2.c Outdated
{
u32 compClearMask = ~sectsToDecomp;
if ((ki->kip1->flags & compClearMask) == ki->kip1->flags)
return; //already decompressed, nothing to do
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return 0

ipl/pkg2.c Outdated
if (blz_uncompress_srcdest(srcDataPtr, compSize, dstDataPtr, outputSize) == 0)
{
gfx_printf(&gfx_con, "%kERROR decomping sect %d of %s KIP!%k\n", 0xFFFF0000, sectIdx, (char*)hdr.name, 0xFFCCCCCC);
while (1) { }
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

free(newKip);
return 1;

ipl/pkg2.c Outdated
free(ki->kip1);
ki->kip1 = newKip;
ki->size = newKipSize;
}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return 0

ipl/pkg2.c Outdated
#ifdef DEBUG_PRINTING
u32 preDecompTime = get_tmr_us();
#endif
pkg2_decompress_kip(ki, bitsAffected);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (pkg2_decompress_kip(ki, bitsAffected))
	return "decompress_failed";

ipl/pkg2.c Outdated
if (memcmp(&kipSectData[currOffset], currPatch->srcData, currPatch->length) != 0)
{
gfx_printf(&gfx_con, "%kDATA MISMATCH FOR PATCH AT OFFSET 0x%x!!!%k\n", 0xFFFF0000, currOffset, 0xFFCCCCCC);
while (1) {} //MUST hang here, means the patch writer messed up
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return "data_mismatch"; or sth.

ipl/hos.c Outdated
if (unappliedPatch != NULL)
{
gfx_printf(&gfx_con, "%kREQUESTED PATCH '%s' NOT APPLIED!%k\n", 0xFFFF0000, unappliedPatch, 0xFFCCCCCC);
while(1) {} //MUST hang here, because if user requests 'nogc' but it's not applied, their GC controller gets updated!
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should either free stuff and return as failed or
add sd_unmount(); before while.

@CTCaer
Copy link
Owner

CTCaer commented Jul 29, 2018

Additionally I was amazed to see that BLZ is almost as fast in decompression as LZ77 and has better compression than LZ4 in normal compressible data (like logos)!

@rajkosto
Copy link
Contributor Author

Payload size is 126293
Max size is 126296 Bytes.

What do you want squashed ? Everything into one or just some of them ? Squashing on the PR branch would pretty much ruin this entire discussion thread as it points to code and commits, which is why maintainers usually just squash on merge if needed.

@CTCaer
Copy link
Owner

CTCaer commented Jul 29, 2018

Everything into one.

I don't mind squashing it myself.
And because the requested changes are done, the discussion is now obsolete.
A programmer can understand why the code now looks like this based on the comments here.

So it's your choice (raj authored and ctc committed or raj committed).

@rajkosto
Copy link
Contributor Author

Squashed.

@CTCaer
Copy link
Owner

CTCaer commented Jul 29, 2018

Thanks.
Now everything except warmboot, is patch-able.

@CTCaer CTCaer merged commit e9d1256 into CTCaer:master Jul 29, 2018
Huntereb pushed a commit to Huntereb/hekate that referenced this pull request Oct 7, 2019
Add KIP1 patching support (with 2 FS patches included)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants